88 research outputs found

    Recurrent mutations in the U2AF1 splicing factor in myelodysplastic syndromes

    Myelodysplastic syndromes (MDS) are hematopoietic stem cell disorders that often progress to chemotherapy-resistant secondary acute myeloid leukemia (sAML). We used whole-genome sequencing to perform an unbiased comprehensive screen to discover the somatic mutations in a sample from an individual with sAML and genotyped the loci containing these mutations in the matched MDS sample. Here we show that a missense mutation affecting the serine at codon 34 (Ser34) in U2AF1 was recurrently present in 13 out of 150 (8.7%) subjects with de novo MDS, and we found suggestive evidence of an increased risk of progression to sAML associated with this mutation. U2AF1 is a U2 auxiliary factor protein that recognizes the AG splice acceptor dinucleotide at the 3' end of introns, and the alterations in U2AF1 are located in highly conserved zinc fingers of this protein. Mutant U2AF1 promotes enhanced splicing and exon skipping in reporter assays in vitro. This previously unidentified, recurrent mutation in U2AF1 implicates altered pre-mRNA splicing as a potential mechanism for MDS pathogenesis.

    An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics

    For a decade, The Cancer Genome Atlas (TCGA) program collected clinicopathologic annotation data along with multi-platform molecular profiles of more than 11,000 human tumors across 33 different cancer types. TCGA clinical data contain key features representing the democratized nature of the data collection process. To ensure proper use of this large clinical dataset associated with genomic features, we developed a standardized dataset named the TCGA Pan-Cancer Clinical Data Resource (TCGA-CDR), which includes four major clinical outcome endpoints. In addition to detailing major challenges and statistical limitations encountered during the effort of integrating the acquired clinical data, we present a summary that includes endpoint usage recommendations for each cancer type. These TCGA-CDR findings appear to be consistent with cancer genomics studies independent of the TCGA effort and provide opportunities for investigating cancer biology using clinical correlates at an unprecedented scale. Analysis of clinicopathologic annotations for over 11,000 cancer patients in the TCGA program leads to the generation of the TCGA Clinical Data Resource, which provides recommendations of clinical outcome endpoint usage for 33 cancer types.

    Proteogenomic integration reveals therapeutic targets in breast cancer xenografts

    Recent advances in mass spectrometry (MS) have enabled extensive analysis of cancer proteomes. Here, we employed quantitative proteomics to profile protein expression across 24 breast cancer patient-derived xenograft (PDX) models. Integrated proteogenomic analysis shows positive correlation between expression measurements from transcriptomic and proteomic analyses; further, gene expression-based intrinsic subtypes are largely recapitulated using non-stromal protein markers. Proteogenomic analysis also validates a number of predicted genomic targets in multiple receptor tyrosine kinases. However, several protein/phosphoprotein events such as overexpression of AKT proteins and ARAF, BRAF, HSP90AB1 phosphosites are not readily explainable by genomic analysis, suggesting that druggable translational and/or post-translational regulatory events may be uniquely diagnosed by MS. Drug treatment experiments targeting HER2 and components of the PI3K pathway supported proteogenomic response predictions in seven xenograft models. Our study demonstrates that MS-based proteomics can identify therapeutic targets and highlights the potential of PDX drug response evaluation to annotate MS-based pathway activities.

    Brain structural correlates of insomnia severity in 1053 individuals with major depressive disorder: results from the ENIGMA MDD Working Group

    It has been difficult to find robust brain structural correlates of the overall severity of major depressive disorder (MDD). We hypothesized that specific symptoms may better reveal correlates and investigated this for the severity of insomnia, both a key symptom and a modifiable major risk factor of MDD. Cortical thickness, surface area and subcortical volumes were assessed from T1-weighted brain magnetic resonance imaging (MRI) scans of 1053 MDD patients (age range 13-79 years) from 15 cohorts within the ENIGMA MDD Working Group. Insomnia severity was measured by summing the insomnia items of the Hamilton Depression Rating Scale (HDRS). Symptom specificity was evaluated with correlates of overall depression severity. Disease specificity was evaluated in two independent samples comprising 2108 healthy controls, and in 260 clinical controls with bipolar disorder. Results showed that MDD patients with more severe insomnia had a smaller cortical surface area, mostly driven by the right insula, left inferior frontal gyrus pars triangularis, left frontal pole, right superior parietal cortex, right medial orbitofrontal cortex, and right supramarginal gyrus. Associations were specific for insomnia severity, and were not found for overall depression severity. Associations were also specific to MDD; healthy controls and clinical controls showed differential insomnia severity association profiles. The findings indicate that MDD patients with more severe insomnia show smaller surfaces in several frontoparietal cortical areas. While explained variance remains small, symptom-specific associations could bring us closer to clues on underlying biological phenomena of MDD.

    Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel

    A major use of the 1000 Genomes Project (1000GP) data is genotype imputation in genome-wide association studies (GWAS). Here we develop a method to estimate haplotypes from low-coverage sequencing data that can take advantage of single-nucleotide polymorphism (SNP) microarray genotypes on the same samples. First the SNP array data are phased to build a backbone (or 'scaffold') of haplotypes across each chromosome. We then phase the sequence data 'onto' this haplotype scaffold. This approach can take advantage of relatedness between sequenced and non-sequenced samples to improve accuracy. We use this method to create a new 1000GP haplotype reference set for use by the human genetic community. Using a set of validation genotypes at SNPs and bi-allelic indels we show that these haplotypes have lower genotype discordance and improved imputation performance into downstream GWAS samples, especially at low-frequency variants.

    Reproducibility in the absence of selective reporting: An illustration from large-scale brain asymmetry research

    Other funding: Max Planck Society (Germany). The problem of poor reproducibility of scientific findings has received much attention over recent years, in a variety of fields including psychology and neuroscience. The problem has been partly attributed to publication bias and unwanted practices such as p-hacking. Low statistical power in individual studies is also understood to be an important factor. In a recent multisite collaborative study, we mapped brain anatomical left-right asymmetries for regional measures of surface area and cortical thickness, in 99 MRI datasets from around the world, for a total of over 17,000 participants. In the present study, we revisited these hemispheric effects from the perspective of reproducibility. Within each dataset, we considered that an effect had been reproduced when it matched the meta-analytic effect from the 98 other datasets, in terms of effect direction and significance threshold. In this sense, the results within each dataset were viewed as coming from separate studies in an "ideal publishing environment," that is, free from selective reporting and p-hacking. We found an average reproducibility rate of 63.2% (SD = 22.9%, min = 22.2%, max = 97.0%). As expected, reproducibility was higher for larger effects and in larger datasets. Reproducibility was not obviously related to the age of participants, scanner field strength, FreeSurfer software version, cortical regional measurement reliability, or regional size. These findings constitute an empirical illustration of reproducibility in the absence of publication bias or p-hacking, when assessing realistic biological effects in heterogeneous neuroscience data, and given typically used sample sizes.

    Driver Fusions and Their Implications in the Development and Treatment of Human Cancers

    Gene fusions represent an important class of somatic alterations in cancer. We systematically investigated fusions in 9,624 tumors across 33 cancer types using multiple fusion calling tools. We identified a total of 25,664 fusions, with a 63% validation rate. Integration of gene expression, copy number, and fusion annotation data revealed that fusions involving oncogenes tend to exhibit increased expression, whereas fusions involving tumor suppressors have the opposite effect. For fusions involving kinases, we found 1,275 with an intact kinase domain, the proportion of which varied significantly across cancer types. Our study suggests that fusions drive the development of 16.5% of cancer cases and function as the sole driver in more than 1% of them. Finally, we identified druggable fusions involving genes such as TMPRSS2, RET, FGFR3, ALK, and ESR1 in 6.0% of cases, and we predicted immunogenic peptides, suggesting that fusions may provide leads for targeted drug and immune therapy.

    Association of C-reactive protein with bacterial and respiratory syncytial virus-associated pneumonia among children aged <5 years in the PERCH study

    Background. Lack of a gold standard for identifying bacterial and viral etiologies of pneumonia has limited evaluation of C-reactive protein (CRP) for identifying bacterial pneumonia. We evaluated the sensitivity and specificity of CRP for identifying bacterial vs respiratory syncytial virus (RSV) pneumonia in the Pneumonia Etiology Research for Child Health (PERCH) multicenter case-control study. Methods. We measured serum CRP levels in cases with World Health Organization-defined severe or very severe pneumonia and a subset of community controls. We evaluated the sensitivity and specificity of elevated CRP for "confirmed" bacterial pneumonia (positive blood culture or positive lung aspirate or pleural fluid culture or polymerase chain reaction [PCR]) compared to "RSV pneumonia" (nasopharyngeal/oropharyngeal or induced sputum PCR-positive without confirmed/suspected bacterial pneumonia). Receiver operating characteristic (ROC) curves were constructed to assess the performance of elevated CRP in distinguishing these cases. Results. Among 601 human immunodeficiency virus (HIV)-negative tested controls, 3% had CRP ≥40 mg/L. Among 119 HIV-negative cases with confirmed bacterial pneumonia, 77% had CRP ≥40 mg/L compared with 17% of 556 RSV pneumonia cases. The ROC analysis produced an area under the curve of 0.87, indicating very good discrimination; a cut-point of 37.1 mg/L best discriminated confirmed bacterial pneumonia (sensitivity 77%) from RSV pneumonia (specificity 82%). CRP ≥100 mg/L substantially improved specificity over CRP ≥40 mg/L, though at a loss to sensitivity. Conclusions. Elevated CRP was positively associated with confirmed bacterial pneumonia and negatively associated with RSV pneumonia in PERCH. CRP may be useful for distinguishing bacterial from RSV-associated pneumonia, although its role in distinguishing bacterial pneumonia from pneumonia associated with other respiratory viruses needs further study.

    Pathogenic Germline Variants in 10,389 Adult Cancers

    We conducted the largest investigation of predisposition variants in cancer to date, discovering 853 pathogenic or likely pathogenic variants in 8% of 10,389 cases from 33 cancer types. Twenty-one genes showed single or cross-cancer associations, including novel associations of SDHA in melanoma and PALB2 in stomach adenocarcinoma. The 659 predisposition variants and 18 additional large deletions in tumor suppressors, including ATM, BRCA1, and NF1, showed low gene expression and frequent (43%) loss of heterozygosity or biallelic two-hit events. We also discovered 33 such variants in oncogenes, including missense variants in MET, RET, and PTPN11 associated with high gene expression. We nominated 47 additional predisposition variants from prioritized VUSs supported by multiple lines of evidence involving case-control frequency, loss of heterozygosity, expression effect, and co-localization with mutations and modified residues. Our integrative approach links rare predisposition variants to functional consequences, informing future guidelines of variant classification and germline genetic testing in cancer. A pan-cancer analysis identifies hundreds of predisposing germline variants.

    Scalable Open Science Approach for Mutation Calling of Tumor Exomes Using Multiple Genomic Pipelines

    The Cancer Genome Atlas (TCGA) cancer genomics dataset includes over 10,000 tumor-normal exome pairs across 33 different cancer types, in total >400 TB of raw data files requiring analysis. Here we describe the Multi-Center Mutation Calling in Multiple Cancers project, our effort to generate a comprehensive encyclopedia of somatic mutation calls for the TCGA data to enable robust cross-tumor-type analyses. Our approach accounts for variance and batch effects introduced by the rapid advancement of DNA extraction, hybridization-capture, sequencing, and analysis methods over time. We present best practices for applying an ensemble of seven mutation-calling algorithms with scoring and artifact filtering. The dataset created by this analysis includes 3.5 million somatic variants and forms the basis for PanCan Atlas papers. The results have been made available to the research community along with the methods used to generate them. This project is the result of collaboration from a number of institutes and demonstrates how team science drives extremely large genomics projects.